The Age of Drag Race: Is Rupaul’s Drag Race biased towards younger queens?

 

Research Question

Rupaul’s Drag Race is a reality competition show for drag queens in which contestants compete to be America’s next Drag Super Star. Amongst the many topics of debate surrounding the show is whether it has a bias towards young queens.

This project visualises the spread of the ages of Drag Race contestants by season, as well as the ages of the queens that won each season. This will allow a general picture of the age range of the queens as well as the identification of whether there is a historical trend in it.

 

Data Origins

This data is provided as part of the package dragracer1. The CRAN archive for the package is https://cran.r-project.org/package=dragracer. The github page is https://github.com/svmiller/dragracer

A description of the project is available on the creator, Steven V. Miller’s blog post: “An Empirical Analysis of RuPaul’s Drag Race Contestants.” The data was compiled by scraping the Rupaul’s Drag Race Wiki

Because it is a package, the first time you run this script, you must run install.packages(dragracer) first.

The dragracer package includes three datasets: rpdr_contestants, which provides information about the contestants, rpdr_contep, which describes each contestant’s performance in each episode, and rpdr_ep, which gives characteristics of each episode. This analysis and visualisation use the rdpr_contestants and rpdr_contep datasets.

The relevant datasets have been exported to csv files using this code: write.csv(df, "filepath\\filename.csv", row.names = FALSE). They are available in the data folder.

#loading required packages with pacman package manager

library(pacman)
pacman::p_load(ggplot2, plotly, magrittr, tidyverse)
library(dragracer) #when loaded returns a catchphrase from the show, omitted from pacman for flair
## The library is now open
#previewing datasets

head(rpdr_contestants)
## # A tibble: 6 x 5
##   season contestant          age dob        hometown                
##   <chr>  <chr>             <dbl> <date>     <chr>                   
## 1 S01    BeBe Zahara Benet    28 1981-03-20 Minneapolis, Minnesota  
## 2 S01    Nina Flowers         34 1974-02-22 Bayamón, Puerto Rico    
## 3 S01    Rebecca Glasscock    26 1983-05-25 Fort Lauderdale, Florida
## 4 S01    Shannel              26 1979-07-03 Las Vegas, Nevada       
## 5 S01    Ongina               26 1982-01-06 Los Angeles, California 
## 6 S01    Jade                 32 1984-11-18 Chicago, Illinois
head(rpdr_contep)
## # A tibble: 6 x 11
##   season  rank missc contestant episode outcome eliminated participant minichalw
##   <chr>  <dbl> <dbl> <chr>        <dbl> <chr>        <dbl>       <dbl>     <dbl>
## 1 S01        1     0 BeBe Zaha~       1 SAFE             0           1         0
## 2 S01        2     1 Nina Flow~       1 WIN              0           1         0
## 3 S01        3     0 Rebecca G~       1 LOW              0           1         0
## 4 S01        4     0 Shannel          1 SAFE             0           1         0
## 5 S01        5     0 Ongina           1 HIGH             0           1         0
## 6 S01        6     0 Jade             1 SAFE             0           1         0
## # ... with 2 more variables: finale <dbl>, penultimate <dbl>

Data Preparation

#create a data frame including the contestant's ages, season, and final ranking in the competition

combined <- left_join(rpdr_contestants, rpdr_contep)
## Joining, by = c("season", "contestant")
ranked <- combined %>% 
          select(c("season", "contestant","rank", "age")) %>%
          distinct()


#create a data frame of winners only

winners <- subset(ranked, rank == "1")

#preview data ranked data frame

head(ranked)
## # A tibble: 6 x 4
##   season contestant         rank   age
##   <chr>  <chr>             <dbl> <dbl>
## 1 S01    BeBe Zahara Benet     1    28
## 2 S01    Nina Flowers          2    34
## 3 S01    Rebecca Glasscock     3    26
## 4 S01    Shannel               4    26
## 5 S01    Ongina                5    26
## 6 S01    Jade                  6    32
# summary statistics

stats <- tapply(ranked$age, ranked$season, summary)

head(stats, 3)
## $S01
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##      26      26      32      31      34      39 
## 
## $S02
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   21.00   25.00   27.00   27.58   29.25   37.00 
## 
## $S03
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   23.00   26.00   28.00   28.23   29.00   36.00

Visualisation

# violin and scatterplots of all contestants' ages with crossbar of mean

age_all <- ggplot(ranked, aes(x = factor(season), y = age)) +
    geom_violin(aes(color = season, fill = season), alpha = 0.5) +
    stat_summary(fun = "mean",
                 geom = "crossbar", 
                 width = 0.3) +
    geom_point(aes(color = season, fill = season),
                   position = position_jitterdodge(jitter.width = .1, dodge.width = 0), 
                   shape=21, size = 1) +
  
# graph aesthetics   
  
    labs(x = element_blank()) +
    ggtitle("Rupaul's Drag Race Contestant Ages") +
    theme(legend.position="none",  
    strip.background=element_blank(),
    axis.text.x=element_text(size=10,color="black"),
    axis.text.y=element_text(size=10,color="black"),
    panel.background = element_blank(),
    panel.grid.major = element_blank(), 
    panel.grid.minor = element_blank(),
    panel.border = element_rect(size= .75,fill = NA,color = "black"),
    plot.margin = unit(c(.5, 0.5, .5, .5), "cm"))
  
#adding interactive overlay of winners' ages - hover over black dot

age_plot <- ggplotly(age_all + geom_point(data = winners, aes(x = factor(season))))


age_plot <- age_plot %>%
            style(hoverinfo = "skip", traces = 1:40) %>%
            style(text = paste("Winner's age:", winners$age))

age_plot

This visualisation shows the distribution of ages of Drag Race contestants by season, including points for individual contestants’ ages, a crossbar showing the mean age of for that season, and a point showing the age of the winner, which displays the contestant’s age upon hovering.

 

Summary, Discussion

Drag Race contestants have ranged in age from 21 to 52. The average age of all participants through season 13 is 29. Winners average age is 27. There does not seem to be a significant trend in age across seasons. The relatively young average age of all contestants and winners upholds the commonly accepted belief that drag is a young girl’s game. As beloved Drag Race contestant Katya Zamolodchkava has said, “There’s nothing more depressing than an older drag queen that doesn’t want to do it.”(2)

 

In the future, I would like to analyse and visualise this data set further to tackle additional debates in the fandom. One visualisation I’d like to do would be of the contestants’ hometowns. Queens come from all over the country, but there are also biases towards New York and L.A., with New York queens being particularly vocal about the excellence and dominance of their scene. Additionally, Puerto Rico features prominently for consistently fielding at least one queen each season. Thus, I would be interested in generating a map, but for this project, I could not find an aesthetically pleasing and practical US grid map. I was excited about applying a specific bit of knowledge from class in using the code separate(rpdr_contestants, col = hometown, c("town", "state"), sep = ",") to divide data combined into one column, ie hometown, into separate columns, ie, town and state.

       

References

  1. Steve Miller (2021). dragracer: Data Sets for RuPaul’s Drag Race. R package version 0.1.6. https://CRAN.R-project.org/package=dragracer
  2. Mattel, Trixie & Zamolodchikova, Katya. (2020, Feb 12). UNHhhh Ep. 114: I’m in My 30s. Youtube. https://www.youtube.com/watch?v=OilvWfWkaaw